DETAILED ACTION

Claim Rejections - 35 USC § 103 

1. The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all
obviousness rejections set forth in this Office action: 

(a) A patent may not be obtained though the invention is not identically disclosed or described as set 
forth in section 102 of this title, if the differences between the subject matter sought to be patented and
the prior art are such that the subject matter as a whole would have been obvious at the time the 
invention was made to a person having ordinary skill in the art to which said subject matter pertains. 
Patentability shall not be negatived by the manner in which the invention was made. 

2. Claims 6, 7, 9, 15, 16, 18, and 26 are rejected under 35 U.S.C. 103(a) as being 
unpatentable over Kuhn et al. in view of Kanevsky et al.

Concerning independent claims 6 and 15, Kuhn et al. discloses a method,
system, and computer program, comprising: 

"receiving digitized voice data from a user" - speech input supplied through 
microphone 26 is first digitized (column 3, lines 66 to 67: Figure 2); 

"processing the voice data to determine one or more phrases recognized as the 
digitized voice data provided by the user based on a currently active recognition 
grammar" - the output of speech recognizer module 40 is supplied to the natural 
language parser 42 working in conjunction with a set of goal oriented grammars 44 
(column 3, line 66 to column 4, line 10: Figure 2); in some instances, the natural 
language parser will immediately identify a program the user is interested in watching, 
but in other instances, there may be multiple choices or possibilities (column 4, lines 38 
to 54: Figure 2); the set of grammars has context-sensitive grammar rules for each
topic, e.g. grammar A 240 and grammar B 242 ("a currently active recognition
grammar") (column 6, lines 50 to 65: Figure 4); 

"when one or more phrase is recognized as the digitized voice data provided by 
the user as a result of voice-recognition uncertainty, using user-specific context 
information to choose a recognized phrase from the one or more phrases recognized as 
the digitized voice data" - automatic speech recognition process block 217 generates 
word confidence vector 268 which indicates how well words in input sentence 218 were 
recognized ("voice-recognition uncertainty"); dialog manager 130 generates dialogue 
context weights 269 by determining the state of the dialog by asking the user about a 
particular topic; due to this request, dialog manager 130 determines what the user said 
(column 7, lines 18 to 29: Figure 4); the dialog manager has a user profile data store 56 
("user-specific context information"), which stores information about the user's previous 
information selections; thus, this data store helps the dialog manager tune its prompts to 
best suit the user's expectations (column 4, lines 48 to 54: Figure 2); N-best processor 
270 selects the highest-scoring candidate as what the user intended (column 7, lines 59 
to 64: Figure 4). 

Concerning independent claims 6 and 15, Kuhn et al. omits an elimination 
procedure to select a final phrase, but Kanevsky et al. discloses a method, system, and 
computer program, comprising: 

"selecting elements of uncertainty within the one or more recognized phrases" - 
as each ambiguity is encountered, recognition is suspended to allow questions to be
presented to the user to discriminate between potential selection classes; an
intermediate question is posed to discriminate between "meet at heaven" and "meet at
seven" (column 3, lines 27 to 39); 

"selecting the user-specific context information from a database based on the 
elements of uncertainty" - classification questions are posed concerning
space or time relationships, whether the phrase describes a noun,
verb, or adjective, etc. (column 3, lines 40 to 63); potential final alternative classes may 
be selected to include a personal characteristics class profile ("user-specific context 
information") (column 4, lines 37 to 45); 

"eliminating phrases within the one or more recognized phrases based on the 
user-specific context information regarding the elements of uncertainty" - based on the 
user's response, intermediate decoding alternatives are narrowed, eliminating choices 
that are incongruous with the user's response (column 5, lines 4 to 9: Figure 2: Step 
132); 

"selecting a final phrase as the recognized phrase once all other phrases within 
the one or more recognized phrases are eliminated" - if all ambiguities have been 
resolved, then a final decoding output is produced using the narrowed set of 
intermediate decoding alternatives; otherwise, the procedure iterates (column 5, lines 8 
to 13: Figure 2: Step 134). 
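
For illustration only, the elimination procedure mapped above (narrow the intermediate decoding alternatives after each user response, then produce a final output once a single alternative remains) can be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from either reference: the function name `resolve_by_elimination`, the attribute dictionaries, and the simulated user answer are all hypothetical.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical representation: each candidate phrase carries attributes
# that a clarifying question can test (e.g. whether it names a time).
Candidate = Tuple[str, Dict[str, bool]]


def resolve_by_elimination(
    candidates: List[Candidate],
    questions: List[str],
    answer: Callable[[str], bool],
) -> str:
    """Drop candidates inconsistent with each answer until one remains."""
    remaining = list(candidates)
    for attribute in questions:
        if len(remaining) <= 1:
            break  # all ambiguities resolved
        reply = answer(attribute)
        narrowed = [c for c in remaining if c[1].get(attribute, False) == reply]
        # Keep the previous set if an answer would eliminate everything.
        if narrowed:
            remaining = narrowed
    if len(remaining) != 1:
        raise ValueError("ambiguity not resolved by the available questions")
    return remaining[0][0]


candidates = [
    ("meet at heaven", {"is_time": False}),
    ("meet at seven", {"is_time": True}),
]
# Simulated user who confirms that the phrase refers to a time.
final = resolve_by_elimination(candidates, ["is_time"], lambda attribute: True)
print(final)
```

The loop iterates only while more than one alternative remains, which mirrors the "otherwise, the procedure iterates" behavior described for Step 134.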

Concerning independent claims 6 and 15, Kanevsky et al. teaches that a system and
method for resolving decoding ambiguity via dialog has the advantage of improving
language decoding performance and accuracy. (Column 1, Lines 50 to 53) It would
have been obvious to one having ordinary skill in the art to utilize the system and
method for resolving decoding ambiguity to iteratively eliminate phrases until a final
phrase is obtained as taught by Kanevsky et al. in the multi-modal dialog unit of Kuhn et 
al. for the purpose of improving language decoding performance and accuracy. 

Concerning claims 7 and 16, Kanevsky et al. teaches selection classes may 
include classification questions about space relationships (column 3, lines 40 to 63), 
corresponding to "location information", which is one of the enumerated alternatives. 

Concerning independent claim 26, Kuhn et al. discloses a system, comprising:

"a voice interface to receive digitized voice data from a user" - speech input 
supplied through microphone 26 is first digitized (column 3, lines 66 to 67: Figure 2); 

"a voice recognition engine processes the voice data to determine one or more 
phrases recognized as the digitized voice data provided by the user based on a 
currently active recognition grammar" - the output of speech recognizer module 40 is 
supplied to the natural language parser 42 working in conjunction with a set of goal 
oriented grammars 44 (column 3, line 66 to column 4, line 10: Figure 2); in some 
instances, the natural language parser will immediately identify a program the user is 
interested in watching, but in other instances, there may be multiple choices or 
possibilities (column 4, lines 38 to 54: Figure 2); the set of grammars has
context-sensitive grammar rules for each topic, e.g. grammar A 240 and grammar B 242 ("a
currently active recognition grammar") (column 6, lines 50 to 65: Figure 4); 

"a database containing user context information" - the dialog manager has a
user profile data store 56, which stores information about the user's previous 
information selections; thus, this data store helps the dialog manager tune its prompts to 
best suit the user's expectations (column 4, lines 48 to 54: Figure 2);

"a user context natural language processor having a capability to select
user-specific context information from a database and use the user-specific context
information to choose a recognized phrase from the one or more phrases recognized as 
the voice data when the voice recognition engine recognized more than one phrase as 
the voice data provided by the user" - the output of speech recognizer module 40 is 
supplied to the natural language parser 42 (column 3, line 67 to column 4, line 2: Figure 
2); automatic speech recognition process block 217 generates word confidence vector 
268 which indicates how well words in input sentence 218 were recognized; the dialog 
manager has a user profile data store 56, which stores information about the user's 
previous information selections ("user-specific context information"); thus, this data store 
helps the dialog manager tune its prompts to best suit the user's expectations (column 
4, lines 48 to 54: Figure 2); dialog manager 130 generates dialogue context weights 269 
by determining the state of the dialog by asking the user about a particular topic; due to 
this request, dialog manager 130 determines what the user said (column 7, lines 18 to 
29: Figure 4); N-best processor 270 selects the highest-scoring candidate as what the 
user intended (column 7, lines 59 to 64: Figure 4). 

Concerning independent claim 26, Kuhn et al. discloses that an N-best processor
selects the N-best candidates based upon associated scores ("elements of uncertainty")
by a plurality of passes (column 7, line 60 to column 8, line 5); Kanevsky et al. teaches
a personal characteristics class directed to sex, age, profession or personal profile
(column 4, lines 42 to 44), containing customer related information such as the 
customer's buying habits, buying needs, and customer's profession (column 5, lines 44 
to 53) ("user-specific context information"), where an N-best list (column 4, lines 7 to 14) 
narrows the set of ambiguities to select a final decoding output from a narrowed set of 
intermediate decoding alternatives (column 5, lines 3 to 13: Figure 2). 

Concerning claims 9 and 18, similar considerations apply as to independent 
claim 26. 

Response to Arguments 

3. Applicant's arguments filed 13 July 2005 have been fully considered but they are 
not persuasive. 

Firstly, Applicant argues that there is no motivation to combine Kuhn et al. with
Kanevsky et al. Applicant admits a motivation is provided that "it would have been
obvious to one having ordinary skill in the art to utilize the system and method for
resolving decoding ambiguity to iteratively eliminate phrases until a final phrase is
obtained as taught by Kanevsky et al. in the multi-modal dialog unit of Kuhn et al. for
the purpose of improving language decoding performance and accuracy." However, 
Applicant submits the motivation is purely conclusory and that no motivation exists in 
either reference for such a combination, and the simple fact that both references
generally discuss speech does not suggest a motivation to combine the references. 
This position is not convincing. 

The cited motivation is expressly stated by Kanevsky et al., and provides a prima
facie reason for combining Kuhn et al. and Kanevsky et al. Applicant's argument
represents a mere allegation of patentability, and a denial of the validity of the 
combination, without providing a rationale as to why the expressly stated motivation is 
deficient. Applicant's position can be construed as attacking the references individually 
without addressing the basis of the combination. One cannot show nonobviousness by 
attacking references individually where the rejections are based on combinations of 
references. See In re Keller, 642 F.2d 413, 208 USPQ 871 (CCPA 1981); In re Merck & 
Co., 800 F.2d 1091, 231 USPQ 375 (Fed. Cir. 1986). A motivation to improve language 
decoding performance and accuracy by providing a method of resolving ambiguities in 
language recognition, as expressly stated by Kanevsky et al. at Column 1, Lines 50 to
60, constitutes a reason for combination, and a rationale for prima facie obviousness. 

Moreover, Kuhn et al. and Kanevsky et al. do more than simply both discuss
speech. Both Kuhn et al. and Kanevsky et al. represent systems and methods for 
speech recognition involving interactive dialogues. Both Kuhn et al. and Kanevsky et al. 
provide methods for improving speech recognition. Kuhn et al. discloses a method of 
improving speech recognition performance by a set of context-sensitive grammars and 
a dialog history. Kanevsky et al. discloses a method of improving speech recognition 
performance by resolving ambiguities with questions directed to alternative classes 
based upon personal characteristics. In fact, Kuhn et al. asks questions during a
dialogue, including questions about viewing time, to determine whether the state of the 
dialogue is time-oriented, to resolve ambiguities during speech recognition, in a manner 
analogous to Kanevsky et al. (Column 7, Lines 20 to 26) Thus, Kuhn et al. and
Kanevsky et al. provide cumulative methods for improving speech recognition
performance involving interactive dialogues. 

Secondly, Applicant argues that the combination of Kuhn et al. and Kanevsky et 
al. does not teach all of the elements of independent claims 6, 15, and 26. Specifically, 
Applicant maintains that Kuhn et al. does not teach the element of "when more than one 
phrase is recognized as the digitized voice data provided by the user as a result of 
voice-recognition uncertainty, using user-specific context information to choose a 
recognized phrase from the one or more phrases recognized as the digitized voice 
data." Applicant says that nothing in Kuhn et al. makes any reference to "voice 
recognition uncertainty". Instead, Applicant states that Kuhn et al. generates a 
confidence vector, i.e. a measure of how well the words in the input sentence were 
recognized. This position is traversed. 

A voice recognition uncertainty is equivalent to a measure of how well words 
were recognized. Kuhn et al. discloses a word confidence vector to determine how well 
each word in an input sentence was recognized. A high value for a word confidence 
vector corresponds to a low voice recognition uncertainty, and a low value for a word 
confidence vector corresponds to a high voice recognition uncertainty. Then, Kuhn et 
al. generates a score for a phrase or sentence by combining together all of the word 
confidence vectors for individual words, as weighted by dialogue context weights.
(Column 7, Lines 18 to 59) A list of N-best candidates is then produced, and the 
process iterates to feed back to a next pass, corresponding to Applicant's "when more 
than one phrase is recognized as the digitized voice data provided by the user as a 
result of voice-recognition uncertainty". (Column 7, Line 60 to Column 8, Line 5) Thus, 
scores and confidence vectors represent a relative certainty or uncertainty of phrases 
and words, respectively. 
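
As an illustration of the reasoning above, the relationship between word confidence, context weighting, phrase scores, and an N-best list can be sketched as follows. This is a minimal sketch, not the method of Kuhn et al.: the weighted-sum combination, the function names `score_candidate` and `n_best`, and all numeric values and phrases are assumptions chosen only to show that a higher confidence yields a higher score (lower uncertainty).

```python
from typing import List, Sequence, Tuple


def score_candidate(confidences: Sequence[float], weights: Sequence[float]) -> float:
    """Combine per-word confidences, weighted by context weights, into one score.

    Assumption: a simple weighted sum; the reference may combine them differently.
    """
    if len(confidences) != len(weights):
        raise ValueError("one weight is required per word confidence")
    return sum(c * w for c, w in zip(confidences, weights))


def n_best(scored: List[Tuple[str, float]], n: int) -> List[Tuple[str, float]]:
    """Return the n highest-scoring candidates, best first."""
    return sorted(scored, key=lambda item: item[1], reverse=True)[:n]


# Hypothetical hypotheses: (word confidences, dialogue context weights).
# The last word is weighted by how well it fits a time-oriented dialogue state.
hypotheses = {
    "watch at seven": ([0.9, 0.8, 0.6], [1.0, 1.0, 2.0]),
    "watch at eleven": ([0.9, 0.8, 0.5], [1.0, 1.0, 0.5]),
}
scored = [(phrase, score_candidate(c, w)) for phrase, (c, w) in hypotheses.items()]
ranked = n_best(scored, 2)
print(ranked[0][0])
```

Under these assumed values the first hypothesis ranks highest; the remaining entries of `ranked` correspond to the case in which more than one phrase is recognized and further disambiguation is needed.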

Finally, Applicant argues that Kuhn et al. does not teach the use of "user-specific 
context information". Applicant says that when Kuhn et al. asks the user about a 
particular topic to generate context weights, this is not based on elements of 
uncertainty, as claimed. Applicant states that nothing in Kuhn et al. teaches or suggests 
a selection of user-specific context information, and that Kanevsky et al. does not teach 
or suggest this element. This position is not persuasive. 

Both Kuhn et al. and Kanevsky et al. disclose aspects of "user-specific context 
information" to resolve uncertainty. Kuhn et al. generates a time question to provide 
dialogue context weights to resolve uncertainty, and maintains a dialogue history to 
resolve uncertainty based upon what a specific user has already said. Kanevsky et al. 
discloses resolving ambiguity by asking questions about decoding alternatives, where 
the decoding alternatives are based upon personal characteristics found in a user's
personal profile. (Column 4, Lines 42 to 45) Thus, both Kuhn et al. and Kanevsky et al. 
teach resolving uncertainty based upon user-specific context information. 

Therefore, the rejection of claims 6, 7, 9, 15, 16, 18, and 26 under 35 U.S.C.
103(a) as being unpatentable over Kuhn et al. in view of Kanevsky et al. is proper.

Conclusion 

4. THIS ACTION IS MADE FINAL. Applicant is reminded of the extension of time 
policy as set forth in 37 CFR 1.136(a).

A shortened statutory period for reply to this final action is set to expire THREE 
MONTHS from the mailing date of this action. In the event a first reply is filed within 
TWO MONTHS of the mailing date of this final action and the advisory action is not 
mailed until after the end of the THREE-MONTH shortened statutory period, then the 
shortened statutory period will expire on the date the advisory action is mailed, and any 
extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of
the advisory action. In no event, however, will the statutory period for reply expire later 
than SIX MONTHS from the mailing date of this final action. 

Any inquiry concerning this communication or earlier communications from the 
examiner should be directed to Martin Lemer whose telephone number is (571) 272-7608.
The examiner can normally be reached on 8:30 AM to 6:00 PM Monday to
Thursday. 

If attempts to reach the examiner by telephone are unsuccessful, the examiner's 
supervisor, Richemond Dorvil, can be reached on (571) 272-7602. The fax phone
number for the organization where this application or proceeding is assigned is
703-872-9306.

Information regarding the status of an application may be obtained from the
Patent Application Information Retrieval (PAIR) system. Status information for 
published applications may be obtained from either Private PAIR or Public PAIR. 
Status information for unpublished applications is available through Private PAIR only. 
For more information about the PAIR system, see http://pair-direct.uspto.gov. Should 
you have questions on access to the Private PAIR system, contact the Electronic 
Business Center (EBC) at 866-217-9197 (toll-free). 